Generic Processes to Use S3L
============================

We will show you two generic processes for getting started with S3L:

1. Experiment Framework
2. Call Algorithms Directly

Experiment Framework
--------------------

We provide built-in experiment processes for different semi-supervised settings
and input data, such as inductive/transductive, with/without graph, given data
split/random split, and so on. The experiment class implements the following
steps: ``load data``, ``data split``, ``hyper-parameters search`` and
``evaluate the selected model on the test data``. To accelerate the
experiments, we also support multi-processing with ``joblib``.

The experiment framework allows you to evaluate supervised/semi-supervised
learning algorithms in fewer than ten statements. For example,

.. code:: python

    import sys
    import os

    from s3l.Experiments import SslExperimentsWithGraph
    from s3l.classification.LPA import LPA

    if __name__ == '__main__':
        # algorithm configurations: (name, estimator, hyper-parameter grid)
        configs = [
            ('LPA', LPA(), {
                'kernel': ['rbf'],
                'n_neighbors': [3, 5, 7]
            })
        ]
        # (name, feature_file, label_file, split_path, graph_file)
        datasets = [
            ('ionosphere', None, None, None, None)
        ]

        experiments = SslExperimentsWithGraph(n_jobs=1)
        experiments.append_configs(configs)
        experiments.append_datasets(datasets)
        experiments.set_metric(performance_metric='accuracy_score')

        results = experiments.experiments_on_datasets(
            unlabel_ratio=0.75, test_ratio=0.2, number_init=4)

        # do something with results

The above code evaluates the ``Label Propagation`` algorithm on the built-in
dataset ``ionosphere``. The best model is searched over the ``rbf`` kernel with
``n_neighbors`` in the range [3, 5, 7]. Finally, the accuracy score is reported
in the local variable ``results``.

Call Algorithms Directly
------------------------

The built-in algorithms can be called directly, as in the ``sklearn`` package.
The algorithms we have implemented are listed `here `_. After reading the
examples of an algorithm on its module page, you can easily try out any
semi-supervised algorithm you like. For example,

.. code:: python

    import sys
    import os
    import numpy as np

    from s3l.classification.TSVM import TSVM
    from s3l.metrics.performance import accuracy_score
    from s3l.datasets import base, data_manipulate

    if __name__ == '__main__':
        datasets = [
            ('house', None, None),
        ]
        # fraction of the data treated as unlabeled
        unlabel_ratio = 0.75

        for name, feature_file, label_file in datasets:
            # load dataset
            X, y = base.load_dataset(name, feature_file, label_file)

            # split into labeled/unlabeled indexes
            _, _, labeled_idxs, unlabeled_idxs = \
                data_manipulate.inductive_split(
                    X=X, y=y, test_ratio=0.,
                    initial_label_rate=1 - unlabel_ratio,
                    split_count=1, all_class=True)

            labeled_idx = labeled_idxs[0]
            unlabeled_idx = unlabeled_idxs[0]

            # fit TSVM with default hyper-parameters and predict on unlabeled data
            tsvm = TSVM()
            tsvm.fit(X, y, labeled_idx)
            pred = tsvm.predict(X[unlabeled_idx])

            print("Accuracy_score: {}".format(
                accuracy_score(y[unlabeled_idx], pred)))

The above code runs ``TSVM`` (Transductive Support Vector Machine) with default
hyper-parameter settings, given features ``X``, labels ``y`` and the indexes of
labeled data ``labeled_idx``. Then, the prediction is evaluated with the
accuracy score on the unlabeled data.
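
The same direct-call pattern should carry over to the other built-in
estimators. Below is a minimal sketch that swaps in ``LPA`` from the first
example, reusing ``X``, ``y`` and the index split produced by the code above.
It assumes ``LPA`` exposes the same ``fit(X, y, labeled_idx)`` / ``predict(X)``
interface as ``TSVM``; check its module page for the exact signature before
relying on it.

.. code:: python

    from s3l.classification.LPA import LPA

    # Hedged sketch: assumes LPA follows the same fit(X, y, labeled_idx) /
    # predict(X) convention as TSVM above; verify against the LPA module page.
    # Reuses X, y, labeled_idx and unlabeled_idx from the previous example.
    lpa = LPA(kernel='rbf', n_neighbors=5)  # values taken from the search grid above
    lpa.fit(X, y, labeled_idx)
    pred = lpa.predict(X[unlabeled_idx])

    print("Accuracy_score: {}".format(
        accuracy_score(y[unlabeled_idx], pred)))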